To address the failure of traditional discrete models to capture the global semantic information of an entire review text in deceptive review detection, a hierarchical neural network model with an attention mechanism was proposed. First, different neural network models were used to model the structure of the text, and the model yielding the best semantic representation was identified. Then, the review was modeled with two attention mechanisms, one based on the user view and one based on the product view: the user view focused on the user's preferences expressed in the review text, and the product view focused on the product features mentioned in it. Finally, the two representations learned from the user and product views were combined into the final semantic representation for deceptive review detection. Experiments were carried out on the Yelp dataset with accuracy as the evaluation metric. The experimental results show that the proposed hierarchical neural network model with attention mechanism performs best, with accuracy 1 to 4 percentage points higher than that of traditional discrete methods and existing neural baseline models.
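The two-view attention step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring function, the use of concatenation to combine the two views, and all names (`attention_pool`, `review_representation`, `user_vec`, `product_vec`) are assumptions made for illustration, and the hidden states stand in for the output of whichever encoder is chosen.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, view_vector):
    """Pool hidden states with weights given by their dot product with view_vector.

    Dot-product scoring is an assumption; the abstract does not specify the
    scoring function.
    """
    scores = [sum(h * v for h, v in zip(state, view_vector)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights


def review_representation(hidden_states, user_vec, product_vec):
    """Combine user-view and product-view representations.

    Concatenation is assumed here as one plausible way to combine them.
    """
    user_repr, _ = attention_pool(hidden_states, user_vec)
    product_repr, _ = attention_pool(hidden_states, product_vec)
    return user_repr + product_repr  # list concatenation


# Toy usage: two hidden states, each aligned with one view.
states = [[1.0, 0.0], [0.0, 1.0]]
combined = review_representation(states, user_vec=[10.0, 0.0], product_vec=[0.0, 10.0])
```

In this toy case the user view puts almost all of its weight on the first hidden state and the product view on the second, so the combined vector keeps both signals side by side.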
In search engine advertising, the relevance between a bid word (Bidword) and a user's query (Query) must be computed in real time. The relevance calculation must take into account dynamic Term weights in advertisements and the assessment of the commercial value of phrases. To address these problems, a phrase relevance calculation approach named ADPCB was proposed, based on behavioral analysis and the Continuous Bag-Of-Words (CBOW) model. First, the vector of each Term was obtained with CBOW. Second, advertiser behavior was analyzed to construct a global weighting tree over phrases, and the phrase structure was analyzed to obtain dynamic Term weights. Finally, the distributed representation of each phrase, produced as a linear combination of Term vectors using the Term weights, was applied to measure the relevance between Bidword and Query. Experiments were conducted on 10000 Query-Bidword pairs with editorial judgments (ratio of positive to negative pairs 1:1). Using Word2vec, ADPCB performed better than Term Frequency-Inverse Document Frequency (TF-IDF) combined with CBOW; at an accuracy of 0.70, ADPCB achieved higher recall than Latent Dirichlet Allocation (LDA), BM25 (Best Match 25) and TF-IDF. The experimental results and analysis show that ADPCB can recognize the commercial value of a phrase and thereby reduce the number of advertisements triggered by Queries of low commercial value, and that it can be used in real-time calculation scenarios.
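The final step of this abstract, a weighted linear combination of Term vectors followed by a relevance measure, can be sketched as follows. This is only an illustration under stated assumptions: the Term vectors and weights are made-up toy values (in the paper they come from CBOW and from the weighting tree), cosine similarity is assumed as the relevance measure, and the function names are hypothetical.

```python
import math


def phrase_vector(terms, term_vectors, term_weights):
    """Weighted linear combination of Term vectors for one phrase.

    Terms without a vector are skipped; Terms without a weight default to 1.0.
    """
    dim = len(next(iter(term_vectors.values())))
    vec = [0.0] * dim
    for term in terms:
        if term not in term_vectors:
            continue
        weight = term_weights.get(term, 1.0)
        for d, value in enumerate(term_vectors[term]):
            vec[d] += weight * value
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy values (assumptions, not learned embeddings or real Term weights).
toy_vectors = {"cheap": [1.0, 0.0], "flights": [0.0, 1.0]}
toy_weights = {"cheap": 0.1, "flights": 1.0}

query = ["cheap", "flights"]
bidword = ["flights"]

weighted = cosine(phrase_vector(query, toy_vectors, toy_weights),
                  phrase_vector(bidword, toy_vectors, toy_weights))
uniform = cosine(phrase_vector(query, toy_vectors, {}),
                 phrase_vector(bidword, toy_vectors, {}))
```

With the toy weights, the low-weight Term "cheap" contributes little to the Query vector, so the weighted similarity to the Bidword is higher than the uniform-weight similarity.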
This paper addresses the problem of selecting the most interesting and useful comments for an online news article. For the problem of extracting a summary of comments on news, a new method combining Weighted Textual Matrix Factorization (WTMF) and information entropy was introduced and shown to be effective for the automatic extraction of social media comments. The representations of tweets and news were constructed with a heterogeneous-graph WTMF model, which alleviated the sparsity problem of short text and preserved the similarity of information. Meanwhile, according to the character distribution of tweets, binary entropy and continuous entropy were constructed to guarantee the diversity of information. Finally, exploiting the submodularity of the objective, a greedy algorithm was designed to obtain an approximately optimal solution to the optimization problem. The experimental results show that the method combining WTMF and information entropy effectively improves the performance of comment summary extraction for social media. The recall and F1 value on ROUGE-2 reach 0.40074 and 0.27330 respectively, which are 0.05 and 0.03 higher than those of the Biterm Topic Model (BTM), an extended model of Latent Dirichlet Allocation (LDA). The proposed model effectively improves the quality of comment summaries for news.
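The greedy step mentioned in this abstract can be sketched for a generic monotone submodular objective under a budget on the number of selected comments. The objective below is a stand-in word-coverage function chosen only because it is simple and submodular; it is not the WTMF and entropy objective of the paper, and the names `greedy_select` and `word_coverage` are hypothetical.

```python
def word_coverage(comments):
    """Stand-in submodular objective: number of distinct words covered."""
    return float(len({word for c in comments for word in c.lower().split()}))


def greedy_select(candidates, objective, budget):
    """Pick up to `budget` candidates, each time adding the one with the
    largest positive marginal gain in `objective`.

    Returns the indices of the selected candidates in selection order.
    """
    selected = []
    remaining = list(range(len(candidates)))
    current = objective([])
    while remaining and len(selected) < budget:
        best_index, best_gain = None, 0.0
        for i in remaining:
            trial = [candidates[j] for j in selected] + [candidates[i]]
            gain = objective(trial) - current
            if gain > best_gain:  # ties keep the earlier candidate
                best_index, best_gain = i, gain
        if best_index is None:  # no candidate adds anything new
            break
        selected.append(best_index)
        remaining.remove(best_index)
        current += best_gain
    return selected


comments = [
    "great goal by the striker",
    "great goal",
    "referee made a bad call",
]
picked = greedy_select(comments, word_coverage, budget=2)
```

Here the second comment is never chosen because every word in it is already covered by the first, which is the kind of redundancy a diversity-aware objective is meant to penalize.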